Censoring weighted separate-and-conquer rule induction from survival data.

Authors

  • Ł Wróbel
  • M Sikora
Abstract

OBJECTIVES: Rule induction is one of the major methods of machine learning. Rule-based models can be easily read and interpreted by humans, which makes them particularly useful in survival studies, as they can help clinicians better understand the analysed data and make informed decisions about patient treatment. Despite this usefulness, there is still little research on rule learning in survival analysis. In this paper we take a step towards rule-based analysis of survival data.

METHODS: We investigate the so-called covering, or separate-and-conquer, method of rule induction in combination with a weighting scheme for handling censored observations. We also focus on rule quality measures, which are one of the key elements differentiating particular implementations of separate-and-conquer rule induction algorithms. We examine 15 rule quality measures that guide the rule induction process and reflect a wide range of rule learning heuristics.

RESULTS: The algorithm is extensively tested on a collection of 20 real survival datasets and compared with state-of-the-art survival tree and random survival forest algorithms. Most of the rule quality measures outperform the Kaplan-Meier estimate and perform at least as well as the tree-based algorithms.

CONCLUSIONS: Separate-and-conquer rule induction in combination with a weighting scheme is an effective technique for building rule-based models of survival data which, in terms of predictive accuracy, are competitive with tree-based representations.
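As a rough illustration of the covering (separate-and-conquer) idea named in the abstract, the following is a minimal sketch and not the authors' algorithm. It assumes a plain binary label instead of survival times, leaves out the weighting scheme for censored observations, and uses precision as a stand-in for the 15 rule quality measures, none of which the abstract specifies. All names (`covers`, `precision`, `grow_rule`, `separate_and_conquer`) and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Example = Dict[str, str]       # attribute name -> categorical value
Rule = List[Tuple[str, str]]   # conjunction of (attribute, value) tests


def covers(rule: Rule, x: Example) -> bool:
    """A rule covers an example when every one of its tests holds."""
    return all(x[a] == v for a, v in rule)


def precision(rule: Rule, X: List[Example], y: List[int]) -> float:
    """Stand-in quality measure (assumption): share of positives among covered examples."""
    covered = [label for x, label in zip(X, y) if covers(rule, x)]
    return sum(covered) / len(covered) if covered else 0.0


def grow_rule(X: List[Example], y: List[int]) -> Rule:
    """Greedily add the test that most improves quality; stop when none does."""
    rule: Rule = []
    best = -1.0  # forces at least one test into every rule
    while True:
        # Candidate tests are taken from examples the current rule still covers.
        candidates = {(a, x[a]) for x in X if covers(rule, x) for a in x} - set(rule)
        scored = [(precision(rule + [c], X, y), c) for c in sorted(candidates)]
        if not scored:
            return rule
        q, c = max(scored, key=lambda s: s[0])
        if q <= best:
            return rule
        rule, best = rule + [c], q


def separate_and_conquer(X: List[Example], y: List[int]) -> List[Rule]:
    """Learn rules one at a time, removing the examples each new rule covers."""
    X, y = list(X), list(y)
    rules: List[Rule] = []
    while any(y):                      # positive examples remain uncovered
        rule = grow_rule(X, y)         # "conquer": learn one rule
        if not rule:
            break
        rules.append(rule)
        keep = [i for i, x in enumerate(X) if not covers(rule, x)]
        X, y = [X[i] for i in keep], [y[i] for i in keep]  # "separate"
    return rules


# Toy data, invented purely for illustration.
X = [
    {"stage": "late", "age": "old"},
    {"stage": "late", "age": "young"},
    {"stage": "early", "age": "old"},
    {"stage": "early", "age": "young"},
]
y = [1, 1, 1, 0]
rules = separate_and_conquer(X, y)
print(rules)
```

In a survival setting, the label-based quality measure would be replaced by one computed from event times and censoring indicators, and each example would carry a weight. Those details depend on the full paper and are not reproduced here.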

Similar resources

J-PMCRI: A Methodology for Inducing Pre-pruned Modular Classification Rules

Inducing rules from very large datasets is one of the most challenging areas in data mining. Several approaches exist to scaling up classification rule induction to large datasets, namely data reduction and the parallelisation of classification rule induction algorithms. In the area of parallelisation of classification rule induction algorithms most of the work has been concentrated on the Top ...

Parallel Rule Induction with Information Theoretic Pre-Pruning

In a world where data is captured on a large scale the major challenge for data mining algorithms is to be able to scale up to large datasets. There are two main approaches to inducing classification rules, one is the divide and conquer approach, also known as the top down induction of decision trees; the other approach is called the separate and conquer approach. A considerable amount of work ...

Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks

In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation of classification rules has focused on the di...

A Hyper-Heuristic for Descriptive Rule Induction

Rule induction from examples is a machine learning technique that finds rules of the form condition → class, where condition and class are logic expressions of the form variable1 = value1 ∧ variable2 = value2 ∧ ... ∧ variablek = valuek. There are in general three approaches to rule induction: exhaustive search, divide-and-conquer, and separate-and-conquer (or its extension as weighted covering). ...

Journal title:
  • Methods of information in medicine

Volume 53, Issue 2

Pages -

Publication date: 2014